AITopics | multimodal sentiment analysis

Collaborating Authors

multimodal sentiment analysis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Neural Information Processing SystemsJun-22-2026, 20:22:54 GMT

Multimodal Sentiment Analysis (MSA) aims to infer human emotions by integrating complementary signals from diverse modalities. However, in real-world scenarios, missing modalities are common due to data corruption, sensor failure, or privacy concerns, which can significantly degrade model performance. To tackle this challenge, we propose Hyper-Modality Enhancement (HME), a novel framework that avoids explicit modality reconstruction by enriching each observed modality with semantically relevant cues retrieved from other samples. This cross-sample enhancement reduces reliance on fully observed data during training, making the method better suited to scenarios with inherently incomplete inputs. In addition, we introduce an uncertainty-aware fusion mechanism that adaptively balances original and enriched representations to improve robustness. Extensive experiments on three public benchmarks show that HME consistently outperforms state-of-the-art methods under various missing modality conditions, demonstrating its practicality in real-world MSA applications.

artificial intelligence, natural language, representation, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.61)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.61)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.48)

Add feedback

Improving Task-Specific Multimodal Sentiment Analysis with General MLLMs via Prompting

Neural Information Processing SystemsJun-21-2026, 14:18:05 GMT

Multimodal Sentiment Analysis (MSA) aims to predict sentiment from diverse data types, such as video, audio, and language. Recent progress in Multimodal Large Language Models (MLLMs) have demonstrated impressive performance across various tasks. However, in MSA, the increase in computational costs does not always correspond to a significant improvement in performance, raising concerns about the cost-effectiveness of applying MLLMs to MSA. This paper introduces the MLLMGuided Multimodal Sentiment Learning Framework (MMSLF). It improves the performance of task-specific MSA models by leveraging the generalized knowledge of MLLMs through a teacher-student framework, rather than directly using MLLMs for sentiment prediction. First, the proposed teacher built upon a powerful MLLM (e.g., GPT-4o-mini), guides the student model to align multimodal representations through MLLM-generated context-aware prompts. Then, knowledge distillation enables the student to mimic the teacher's predictions, thus allowing it to predict sentiment independently without relying on the context-aware prompts. Extensive experiments on the SIMS, MOSI, and MOSEI datasets demonstrate that our framework enables task-specific models to achieve state-of-the-art performance across most metrics. This also provides new insights into the application of general MLLMs for improving MSA.1

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education > Educational Technology > Educational Software (0.35)
Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)

Add feedback

CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning

Neural Information Processing SystemsJun-15-2026, 06:42:35 GMT

Multimodal machine learning, mimicking the human brain's ability to integrate various modalities has seen rapid growth. Most previous multimodal models are trained on perfectly paired multimodal input to reach optimal performance. In real-world deployments, however, the presence of modality is highly variable and unpredictable, causing the pre-trained models in suffering significant performance drops and fail to remain robust with dynamic missing modalities circumstances. In this paper, we present a novel Cyclic INformative Learning framework (CyIN) to bridge the gap between complete and incomplete multimodal learning. Specifically, we firstly build an informative latent space by adopting token-and label-level Information Bottleneck (IB) cyclically among various modalities. Capturing task-related features with variational approximation, the informative bottleneck latents are purified for more efficient cross-modal interaction and multimodal fusion. Moreover, to supplement the missing information caused by incomplete multimodal input, we propose cross-modal cyclic translation by reconstruct the missing modalities with the remained ones through forward and reverse propagation process. With the help of the extracted and reconstructed informative latents, CyIN succeeds in jointly optimizing complete and incomplete multimodal learning in one unified model. Extensive experiments on 4 multimodal datasets demonstrate the superior performance of our method in both complete and diverse incomplete scenarios.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States > New York (0.28)
North America > United States > Minnesota (0.27)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Neural Information Processing SystemsJun-14-2026, 04:51:55 GMT

artificial intelligence, natural language, proceedings, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media (0.32)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.32)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.32)

Add feedback

Improving Task-Specific Multimodal Sentiment Analysis with General MLLMs via Prompting

Neural Information Processing SystemsJun-13-2026, 19:06:59 GMT

Multimodal Sentiment Analysis (MSA) aims to predict sentiment from diverse data types, such as video, audio, and language. Recent progress in Multimodal Large Language Models (MLLMs) have demonstrated impressive performance across various tasks. However, in MSA, the increase in computational costs does not always correspond to a significant improvement in performance, raising concerns about the cost-effectiveness of applying MLLMs to MSA. This paper introduces the MLLM-Guided Multimodal Sentiment Learning Framework (MMSLF). It improves the performance of task-specific MSA models by leveraging the generalized knowledge of MLLMs through a teacher-student framework, rather than directly using MLLMs for sentiment prediction. First, the proposed teacher built upon a powerful MLLM (e.g., GPT-4o-mini), guides the student model to align multimodal representations through MLLM-generated context-aware prompts. Then, knowledge distillation enables the student to mimic the teacher's predictions, thus allowing it to predict sentiment independently without relying on the context-aware prompts. Extensive experiments on the SIMS, MOSI, and MOSEI datasets demonstrate that our framework enables task-specific models to achieve state-of-the-art performance across most metrics. This also provides new insights into the application of general MLLMs for improving MSA.

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)

Add feedback

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

Neural Information Processing SystemsMar-20-2026, 22:55:46 GMT

The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction seeking to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we consider it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and dominant modality based multimodal learning (DMML) module, which enhances the model's robustness across various noise scenarios by ensuring the quality of dominant modality representations. Aside from the methodical design, we perform comprehensive experiments under random data missing scenarios, utilizing diverse and meaningful settings on several popular datasets (e.g., MOSI, MOSEI, and SIMS), providing additional uniformity, transparency, and fairness compared to existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines, demonstrating superior performance across these challenging and extensive evaluation metrics.

artificial intelligence, natural language, proceedings, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media (0.32)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.32)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.32)

Add feedback

d611d5c0251d9680f869c5d2c46c6fcd-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-18-2026, 07:22:57 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Tennessee > Shelby County > Memphis (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Zhejiang Province (0.04)
Africa > South Sudan > Greater Upper Nile > Greater Pibor Administrative Area > Boma (0.04)

Industry:

Education (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
Information Technology (0.67)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
(3 more...)

Add feedback

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

Neural Information Processing SystemsFeb-15-2026, 12:18:21 GMT

Recognizing that the language modality typically contains dense sentiment information, we consider it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.41)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.41)

Add feedback

Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning Mingcheng Li1,3 Dingkang Y ang 1,3 Y ang Liu 1 Shunli Wang 1,3

Neural Information Processing SystemsFeb-10-2026, 20:28:52 GMT

To this end, we propose a Hierarchical Representation Learning Framework (HRLF) for the MSA task under uncertain missing modalities.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
Asia > Middle East > Israel (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.46)
Information Technology (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

MIDG: Mixture of Invariant Experts with knowledge injection for Domain Generalization in Multimodal Sentiment Analysis

Li, Yangle, Luo, Danli, Hu, Haifeng

arXiv.org Artificial IntelligenceDec-9-2025

Existing methods in domain generalization for Multimodal Sentiment Analysis (MSA) often overlook inter-modal synergies during invariant features extraction, which prevents the accurate capture of the rich semantic information within multimodal data. Additionally, while knowledge injection techniques have been explored in MSA, they often suffer from fragmented cross-modal knowledge, overlooking specific representations that exist beyond the confines of unimodal. To address these limitations, we propose a novel MSA framework designed for domain generalization. Firstly, the framework incorporates a Mixture of Invariant Experts model to extract domain-invariant features, thereby enhancing the model's capacity to learn synergistic relationships between modalities. Secondly, we design a Cross-Modal Adapter to augment the semantic richness of multimodal representations through cross-modal knowledge injection. Extensive domain experiments conducted on three datasets demonstrate that the proposed MIDG achieves superior performance.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2512.0743

Country:

North America (0.28)
Oceania > Australia (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.74)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback